Weak System Models for Fault-Tolerant Distributed Agreement Problems
Abstract
This thesis investigates various aspects of weak system models for agreement problems in fault-tolerant distributed computing. In Part I, we provide an introduction to the context of this work, discuss related literature, and describe the basic system assumptions.

In Part II, we introduce the Asynchronous Bounded-Cycle (ABC) model, which is entirely time-free. In contrast to existing system models, the ABC model does not require explicit time-based synchrony bounds, but rather stipulates a graph-theoretic synchrony condition on the relative lengths of certain causal chains of messages in the space-time graph of a run. We compare the ABC model to other models in the literature, in particular to the classic models by Dwork, Lynch, and Stockmeyer. We show how to simulate lock-step rounds despite Byzantine failures, which makes consensus solvable, and prove the correctness of a clock synchronization algorithm in the ABC model. We then present the technically most involved result of this thesis: we prove that any algorithm working correctly in the partially synchronous Θ-Model by Le Lann and Schmid also works correctly in the time-free ABC model. In the proof, we use a variant of Farkas’ Theorem of Linear Inequalities and develop a non-standard cycle space on directed graphs in order to guarantee the existence of a certain message delay transformation for finite prefixes of runs. This shows that any time-free safety property satisfied by an algorithm in the Θ-Model also holds in the ABC model. By employing methods from point-set topology, we extend this result to liveness properties.

In Part III, we shift our attention to the borderland between models where consensus is solvable and the purely asynchronous model. To this end, we look at the k-set agreement problem, where processes need to decide on at most k distinct decision values. We introduce two very weak system models, M^anti and M^sink, and prove that consensus is impossible in these models. Nevertheless, we show that (n−1)-set agreement is solvable in M^anti and M^sink by providing algorithms that implement the weakest failure detector L. We also discuss how the models M^anti and M^sink relate to the f-source models by Aguilera et al. for solving consensus. In the subsequent chapter, we present a novel failure detector L(k) that generalizes L, and analyze an algorithm for solving k-set agreement with L(k), which works even in systems without unique process identifiers. Moreover, we explore the relationship between L(k) and existing failure detectors for k-set agreement. Some aspects of L(k) relating to anonymous systems are also discussed.

This research has been supported by the Austrian Science Foundation (FWF) projects P17757 and P20529.
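The k-set agreement problem mentioned in the abstract can be made concrete with a small checker. The following Python sketch is illustrative only and is not taken from the thesis: the function name `satisfies_k_set_agreement` and the dictionary encoding of a run are assumptions made for this example. It tests the two safety properties usually required of k-set agreement, namely k-agreement (at most k distinct decision values) and validity (every decided value was proposed by some process). Validity is the standard textbook condition, which the abstract does not spell out. Termination is a liveness property and cannot be checked on a finite snapshot, so it is left out.

```python
from typing import Dict, Hashable, Optional


def satisfies_k_set_agreement(
    proposals: Dict[str, Hashable],
    decisions: Dict[str, Optional[Hashable]],
    k: int,
) -> bool:
    """Check the safety properties of k-set agreement on one finite run.

    proposals: hypothetical encoding, process id -> proposed value.
    decisions: process id -> decided value, or None if undecided so far.
    k:         maximum number of distinct decision values allowed.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Distinct values decided so far (undecided processes are ignored).
    decided = {v for v in decisions.values() if v is not None}

    # Validity: every decided value was proposed by some process.
    validity = decided <= set(proposals.values())

    # k-agreement: at most k distinct values are decided.
    k_agreement = len(decided) <= k

    return validity and k_agreement


if __name__ == "__main__":
    proposals = {"p1": "a", "p2": "b", "p3": "c"}
    decisions = {"p1": "a", "p2": "b", "p3": "b"}
    # Two distinct decisions: acceptable for k = 2 = n - 1, not for consensus.
    print(satisfies_k_set_agreement(proposals, decisions, 2))  # True
    print(satisfies_k_set_agreement(proposals, decisions, 1))  # False
```

With k = 1 the check coincides with the safety part of consensus, and with k = n − 1 (here n = 3) it corresponds to the (n−1)-set agreement case discussed in the abstract.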
Similar resources
Failure Detectors: implementation issues and impact on consensus performance
Due to their nature, distributed systems are vulnerable to failures of some of their parts. Conversely, distribution also provides a way to increase the fault tolerance of the overall system. However, achieving fault tolerance is not a simple problem and requires complex techniques. An agreement problem known as the problem of consensus is at the heart of most problems encountered during the de...
Exploiting Omissive Faults in Synchronous Approximate Agreement
In a fault-tolerant distributed system, it is often necessary for nonfaulty processes to agree on the value of a shared data item. The criterion of Approximate Agreement does not require processes to achieve exact agreement on a value; rather, they need only agree to within a predefined numerical tolerance. Approximate Agreement can be achieved through convergent voting algorithms. Previous re...
An Agreement Service for Implementing Fault Tolerant Distributed Software
Distributed systems include a large number of processors, which increases the risk of failures. Fault tolerance is of key importance in such systems. Implementing fault tolerant distributed software (FTDS) is a difficult task [2]. Group communication services [8] such as group membership and reliable multicast have been proposed to solve some of the problems in implementing FTDS. In this paper w...
Modeling Fault-Tolerant and Reliable Mobile Agent Execution in Distributed Systems
The reliable execution of a mobile agent is a very important design issue in building a mobile agent system and many fault-tolerant schemes have been proposed so far. To further develop mobile agent technology, reliability mechanisms such as fault tolerance and transaction support are required. For this purpose, we first identify two basic requirements for fault-tolerant mobile agent execution:...
Voting Algorithm Based on Adaptive Neuro Fuzzy Inference System for Fault Tolerant Systems
Some applications are critical and must be designed as fault-tolerant systems. A voting algorithm is usually one of the principal elements of a fault-tolerant system. Two kinds of voting algorithm are used in most applications: the majority voting algorithm and the weighted average algorithm. These algorithms have some problems. Majority voting faces the problem of threshold limits and voter of weight...
Publication date: 2010